Sign Stable Projections, Sign Cauchy Projections and Chi-Square Kernels

Authors

  • Ping Li
  • Gennady Samorodnitsky
  • John E. Hopcroft

Abstract

The method of stable random projections is popular for efficiently computing the l_α distances in high dimension (where 0 < α ≤ 2), using small space. Because it adopts nonadaptive linear projections, this method is naturally suitable when the data are collected in a dynamic streaming fashion (i.e., turnstile data streams). In this paper, we propose to use only the signs of the projected data and analyze the probability of collision (i.e., when the two signs differ). We derive a bound of the collision probability which is exact when α = 2 and becomes less sharp when α moves away from 2. Interestingly, when α = 1 (i.e., Cauchy random projections), we show that the probability of collision can be accurately approximated as functions of the chi-square (χ²) similarity. For example, when the (un-normalized) data are binary, the maximum approximation error of the collision probability is smaller than 0.0192. In text and vision applications, the χ² similarity is a popular measure for nonnegative data when the features are generated from histograms. Our experiments confirm that the proposed method is promising for large-scale learning applications.
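
The idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: the function names, the sample sizes, and the normalized form of the χ² similarity (the sum of 2·u_i·v_i/(u_i + v_i) over vectors scaled to sum to 1) are assumptions made for illustration. It projects two nonnegative vectors with a shared random matrix, keeps only the signs, and measures how often the signs differ. For α = 2 (Gaussian projections) the rate can be compared with the well-known value arccos(ρ)/π, where ρ is the cosine similarity. For α = 1 (Cauchy projections) it prints arccos(ρ_χ²)/π next to the empirical rate as one plausible form of the approximation the abstract mentions, which the abstract itself does not spell out.

```python
import numpy as np


def collision_rate(u, v, k, alpha, rng):
    """Fraction of k sign projections on which u and v disagree.

    alpha=2 uses Gaussian entries, alpha=1 uses Cauchy entries
    (the two stable laws with simple samplers in NumPy).
    """
    d = u.shape[0]
    if alpha == 2:
        R = rng.standard_normal((d, k))
    elif alpha == 1:
        R = rng.standard_cauchy((d, k))
    else:
        raise ValueError("this sketch only supports alpha in {1, 2}")
    # Both vectors must share the same projection matrix R.
    return float(np.mean(np.sign(u @ R) != np.sign(v @ R)))


def cosine_similarity(u, v):
    rho = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.clip(rho, -1.0, 1.0))


def chi2_similarity(u, v):
    """Chi-square similarity of nonnegative vectors, each scaled to sum to 1.

    Assumed form: sum_i 2 u_i v_i / (u_i + v_i); coordinates where both
    entries are zero contribute nothing.
    """
    u = u / u.sum()
    v = v / v.sum()
    s = u + v
    mask = s > 0
    return float(np.sum(2.0 * u[mask] * v[mask] / s[mask]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.random(20)  # nonnegative, histogram-like toy data
    v = rng.random(20)
    k = 50_000

    gauss = collision_rate(u, v, k, 2, rng)
    print("alpha=2 empirical:", gauss,
          "arccos(rho)/pi:", np.arccos(cosine_similarity(u, v)) / np.pi)

    cauchy = collision_rate(u, v, k, 1, rng)
    print("alpha=1 empirical:", cauchy,
          "arccos(rho_chi2)/pi (assumed form):",
          np.arccos(chi2_similarity(u, v)) / np.pi)
```

Because only one bit per projection is stored, the empirical rate is a simple average of k indicator values, so its sampling error shrinks roughly as 1/sqrt(k).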

Similar resources

Sign Cauchy Projections and Chi-Square Kernel

The method of stable random projections is useful for efficiently approximating the l_α distance (0 < α ≤ 2) in high dimension and it is naturally suitable for data streams. In this paper, we propose to use only the signs of the projected data and we analyze the probability of collision (i.e., when the two signs differ). Interestingly, when α = 1 (i.e., Cauchy random projections), we show that t...

Sign Stable Random Projections for Large-Scale Learning

In this paper, we study the use of “sign α-stable random projections” (where 0 < α ≤ 2) for building basic data processing tools in the context of large-scale machine learning applications (e.g., classification, regression, clustering, and near-neighbor search). After the processing by sign stable random projections, the inner products of the processed data approximate various types of nonlinea...

Some new properties of biharmonic heat kernels

Contrary to the second order case, biharmonic heat kernels are sign-changing. A deep knowledge of their behaviour may however allow to prove positivity results for solutions of the Cauchy problem. We establish further properties of these kernels, we prove some Lorch-Szegö-type monotonicity results and we give some hints on how to obtain similar results for higher polyharmonic parabolic problems.

A model based, anatomy dependent method for ultra-fast creation of primary SPECT projections

  Introduction: Monte Carlo (MC) is the most common method for simulating virtual SPECT projections. It is useful for optimizing procedures, evaluating correction algorithms and more recently image reconstruction as a forward projector in iterative algorithms; however, the main drawback of MC is its long run time. We introduced a model based method considering the eff...

Level of Grammatical Proficiency and Acquisition of Functional Projections: The case of Iranian learners of English language

Unlike Lexical Projections, Functional Projections (Extended Projections) are more of an ‘abstract’ in nature. Therefore, Functional Projections seem to be acquired later than Lexical Projections by the L2 learners. The present study investigates Iranian L2 learners’ acquisition of English Extended Projections taking into account their level of grammatical proficiency. Specifically, the aim is ...

Journal:
  • CoRR

Volume: abs/1308.1009

Publication year: 2013